Add Tinker API server for supervised fine-tuning with SkyRL backend#986
Draft
tyler-griggs wants to merge 6 commits into main from
Conversation
This enables SkyRLTrainBackend to use AdamParams.learning_rate from Tinker's optim_step() requests, allowing external learning rate schedules.

Changes:
- Add set_lr() to PolicyWorkerBase and CriticWorkerBase
- Add set_lr() dispatch method to WorkerDispatch
- Update SkyRLTrainBackend.optim_step() to apply the learning rate before stepping
- Add GPU tests for set_lr functionality

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
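As a rough illustration of the mechanism, a minimal sketch assuming a standard PyTorch optimizer; the class and attribute names below are simplified stand-ins, not the actual PolicyWorkerBase/WorkerDispatch code:

```python
# Minimal sketch only: set_lr() writes the externally supplied learning rate into
# the optimizer so the next optim_step() uses Tinker's schedule.
import torch

class WorkerSketch:
    def __init__(self, params):
        self.optimizer = torch.optim.AdamW(params, lr=1e-5)

    def set_lr(self, lr: float) -> None:
        # Overwrite every param group's lr with the value from AdamParams.learning_rate.
        for group in self.optimizer.param_groups:
            group["lr"] = lr

    def optim_step(self) -> None:
        self.optimizer.step()
        self.optimizer.zero_grad()

worker = WorkerSketch([torch.nn.Parameter(torch.zeros(4))])
worker.set_lr(3e-4)  # value taken from the optim_step() request
worker.optim_step()
```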
Tinker manages the learning rate externally via set_lr(), so we disable SkyRL's internal scheduler by setting it to "constant" with no warmup. This prevents conflicts between Tinker's LR schedule and SkyRL's scheduler.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
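Purely for illustration (the real SkyRL config keys are not shown in this PR and may differ), the intent is roughly:

```python
# Hypothetical override, key names are assumptions: force the internal scheduler
# to a constant schedule with zero warmup so set_lr() is the only source of
# learning-rate changes.
scheduler_override = {
    "scheduler": "constant",
    "num_warmup_steps": 0,
}
```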
When set_lr() updates the optimizer's param_groups directly, get_lr() needs to read from the same source. Previously, get_lr() read from scheduler.get_last_lr(), which would return stale values after set_lr().

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
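A minimal sketch of the behavior being described, assuming a standard PyTorch optimizer; the function shape is illustrative, not the actual worker code:

```python
# Illustrative only: read the learning rate from optimizer.param_groups, the same
# place set_lr() writes to, instead of scheduler.get_last_lr(), which goes stale
# once set_lr() bypasses the scheduler.
def get_lr(optimizer) -> float:
    return optimizer.param_groups[0]["lr"]
```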
This adds the Tinker HTTP API server that enables external training
orchestration (e.g., from tinker-cookbook sl_loop.py) using the
SkyRL-Train backend.
Key changes:
- Added skyrl_train/tinker/ with API server, engine, and SkyRL backend
- Added skyrl_train/tx_utils/ for shared utilities
- Made JAX optional in loss_fns.py (only needed for JAX backend)
- Updated all imports from tx.* to skyrl_train.*
- Fixed engine subprocess path to use skyrl_train module
This enables running supervised fine-tuning with external control:
# Start server:
uv run --extra vllm python -m skyrl_train.tinker.api \
--base-model Qwen/Qwen3-0.6B --backend skyrl_train --port 8001
# Run training from tinker-cookbook:
TINKER_API_KEY=test uv run python -m tinker_cookbook.recipes.sl_loop \
base_url="http://localhost:8001" model_name="Qwen/Qwen3-0.6B"
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Summary
This PR adds the Tinker HTTP API server that enables external training orchestration using the SkyRL-Train backend. This allows tools like tinker-cookbook to run supervised fine-tuning (SFT) by making HTTP calls to control training, rather than embedding training logic directly.

Key capabilities:
- forward_backward() and optim_step() operations (see the sketch below)
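For orientation, a rough sketch of the control loop an external orchestrator drives against this server; the real loop lives in tinker-cookbook's sl_loop.py, and the client object and method signatures here are assumptions, not the actual Tinker client API:

```python
# Hypothetical sketch only: the orchestrator owns the schedule and the data loop,
# while the server's SkyRL backend does the actual training work.
def run_sft(client, batches, lr_schedule):
    for step, batch in enumerate(batches):
        # Ask the server to compute the loss and accumulate gradients for this batch.
        client.forward_backward(batch)
        # Apply this step's externally chosen learning rate, then take an optimizer step.
        client.optim_step(learning_rate=lr_schedule(step))
```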
Changes

New directories:
- skyrl_train/tinker/ - Tinker API server and engine
  - api.py - FastAPI server with training endpoints
  - engine.py - Background process handling training requests
  - backends/skyrl_train.py - SkyRL backend implementation
  - types.py - Type definitions for API contracts
  - loss_fns.py - Loss function implementations (JAX made optional)
- skyrl_train/tx_utils/ - Shared utilities
  - generator.py, log.py, models.py, etc.

Key modifications:
- Updated all imports from tx.* to skyrl_train.*
- Fixed engine subprocess path to use skyrl_train.tinker.engine

Architecture
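Per the commit message above: api.py is a FastAPI server that accepts Tinker requests and hands them to engine.py, a background process that drives the selected backend (here backends/skyrl_train.py). A hypothetical sketch of the subprocess launch the second bullet refers to; the actual arguments passed to the engine are not shown in this PR:

```python
# Hypothetical sketch: launching the engine under the corrected module path.
# Engine arguments are omitted because they are not shown here.
import subprocess
import sys

engine_proc = subprocess.Popen([sys.executable, "-m", "skyrl_train.tinker.engine"])
```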
Usage
Start the Tinker API server:
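```bash
uv run --extra vllm python -m skyrl_train.tinker.api \
    --base-model Qwen/Qwen3-0.6B --backend skyrl_train --port 8001
```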
Run supervised fine-tuning from tinker-cookbook:
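```bash
TINKER_API_KEY=test uv run python -m tinker_cookbook.recipes.sl_loop \
    base_url="http://localhost:8001" model_name="Qwen/Qwen3-0.6B"
```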
Test Results
Successfully tested with sl_loop.py from tinker-cookbook:

Background
This integration was needed to enable external training orchestration for SFT workloads. The original Tinker code lived in skyrl-tx but required Flash Attention, which had environment issues there. Copying it into skyrl-train lets it use the working Flash Attention installation and the existing SkyRL infrastructure.

Next Steps
🤖 Generated with Claude Code